A linear time algorithm for Shortest Cyclic Cover of Strings
Abstract
Merging words according to their overlap yields a superstring. This basic operation makes it possible to infer long strings from a collection of short pieces, as in genome assembly. To capture a maximum of overlaps, the goal is to infer the shortest superstring of a set of input words. The Shortest Cyclic Cover of Strings (SCCS) problem asks, instead of a single linear superstring, for a set of cyclic strings that contain the words as substrings and whose total length is minimal. SCCS is a crucial step in polynomial-time approximation algorithms for the notoriously hard Shortest Superstring problem, where it is solved in cubic time; the cyclic strings are then cut and merged to build a linear superstring. SCCS can also be solved by a greedy algorithm. Here, we propose a linear-time algorithm for solving SCCS based on an Eulerian graph that captures all greedy solutions in linear space. Because the graph is Eulerian, this algorithm can also find a greedy solution of SCCS with the smallest number of cyclic strings. This has implications for solving certain instances of the Shortest Linear Superstring and Shortest Cyclic Superstring problems.
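To make the notions of overlap and cyclic cover concrete, the following is a minimal brute-force sketch of one common greedy formulation: repeatedly pick the pair of words with the largest overlap, allowing a word to be merged with itself to close a cycle. It is not the linear-time Eulerian-graph algorithm of the paper; it takes quadratically many pairs and is meant only as an illustration. The function names (`overlap`, `greedy_cyclic_cover`, `occurs_cyclically`) are hypothetical, and the sketch assumes the input words are non-empty, distinct, and that no word is a substring of another.

```python
from typing import List, Optional


def overlap(u: str, v: str) -> int:
    """Length of the longest proper suffix of u that is also a proper prefix of v."""
    for k in range(min(len(u), len(v)) - 1, 0, -1):
        if u.endswith(v[:k]):
            return k
    return 0


def greedy_cyclic_cover(words: List[str]) -> List[str]:
    """Naive greedy cyclic cover (assumption: words are non-empty and factor-free)."""
    n = len(words)
    # All ordered pairs, including (i, i), by decreasing overlap (stable sort).
    pairs = sorted(
        ((overlap(words[i], words[j]), i, j) for i in range(n) for j in range(n)),
        key=lambda t: -t[0],
    )
    succ: List[Optional[int]] = [None] * n
    ov_to_succ = [0] * n
    has_pred = [False] * n
    for ov, i, j in pairs:
        # Accept a pair only if i still lacks a successor and j a predecessor.
        if succ[i] is None and not has_pred[j]:
            succ[i], ov_to_succ[i], has_pred[j] = j, ov, True

    # The successor map is a permutation; spell out one cyclic string per cycle.
    cycles, seen = [], [False] * n
    for start in range(n):
        if seen[start]:
            continue
        parts, k = [], start
        while not seen[k]:
            seen[k] = True
            # Keep each word minus its overlap with the next word in the cycle.
            parts.append(words[k][: len(words[k]) - ov_to_succ[k]])
            k = succ[k]
        cycles.append("".join(parts))
    return cycles


def occurs_cyclically(word: str, cyc: str) -> bool:
    """True if word can be read along the cyclic string cyc, wrapping around."""
    return word in cyc * (len(word) // len(cyc) + 2)


print(greedy_cyclic_cover(["abc", "bca", "cab"]))  # ['abc']
print(greedy_cyclic_cover(["ab", "ba", "cd"]))     # ['ab', 'cd']
```

In the first call the three words overlap pairwise by two letters and collapse into a single cyclic string of length 3, whereas a linear superstring such as `abcab` needs length 5. The total length of the cover equals the sum of the word lengths minus the sum of the chosen overlaps.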
Similar resources
Linear Time Inference of Strings from Cover Arrays using a Binary Alphabet
Covers, being one of the most popular forms of regularity in strings, have drawn much attention over time. In this paper, we focus on the problem of linear-time inference of strings from cover arrays using the smallest possible alphabet. We present an algorithm that can reconstruct a string x over a two-letter alphabet whenever a valid cover array C is given as input. This algorithm uses ...
Linear Time Inference of Strings from Cover Arrays Using a Binary Alphabet - (Extended Abstract)
Covers, being one of the most popular forms of regularity in strings, have drawn much attention over time. In this paper, we focus on the problem of linear-time inference of strings from cover arrays using the smallest possible alphabet. We present an algorithm that can reconstruct a string x over a two-letter alphabet whenever a valid cover array C is given as input. This algorithm uses ...
[hal-00742061, v1] Efficient Seeds Computation Revisited
The notion of a cover is a generalization of a period of a string, and there are linear-time algorithms for finding the shortest cover. The seed is a more complicated generalization of periodicity: it is a cover of a superstring of a given string, and the shortest seed problem is of much higher algorithmic difficulty. The problem is not well understood, and no linear-time algorithm is known. In t...
Efficient Seeds Computation Revisited
The notion of a cover is a generalization of a period of a string, and there are linear-time algorithms for finding the shortest cover. The seed is a more complicated generalization of periodicity: it is a cover of a superstring of a given string, and the shortest seed problem is of much higher algorithmic difficulty. The problem is not well understood, and no linear-time algorithm is known. In t...
A New Algorithm for the Discrete Shortest Path Problem in a Network Based on Ideal Fuzzy Sets
The shortest path problem is a practical issue in real-world networks. This paper addresses the fuzzy shortest path (FSP) problem, which seeks the best fuzzy path among sets of fuzzy paths. For this purpose, a new efficient algorithm is introduced, based on a new definition of ideal fuzzy sets (IFSs), to determine the fuzzy shortest path. Moreover, this algorithm is developed for ...
Journal title:
- J. Discrete Algorithms
Volume 37
Publication date: 2016